HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Quantization is an effective method for reducing the memory footprint and inference time of neural networks. However, ultra-low-precision quantization can lead to significant degradation in model accuracy. A promising way to address this is mixed-precision quantization, where more sensitive layers are kept at higher precision. However, the search space for mixed-precision quantization is exponential in the number of layers. Recent work has proposed a novel Hessian-based framework that reduces this exponential search space using second-order information. While promising, this prior work has three major limitations: (i) it uses only a heuristic metric based on the top Hessian eigenvalue as a measure of sensitivity and does not consider the rest of the Hessian spectrum; (ii) it provides only the relative sensitivity of different layers and therefore requires manual selection of the mixed-precision setting; and (iii) it does not consider mixed-precision activation quantization.
Review for NeurIPS paper: HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Summary and Contributions: This paper suggests that the Hessian trace is a good metric for automating the choice of quantization bit-width for each layer, in contrast to previous approaches that use the top Hessian eigenvalue. Some mathematical analysis is provided to support the claim that the Hessian trace is a better metric than the top Hessian eigenvalue, and memory footprint and model accuracy are compared across several models on the ImageNet database. The paper also shows that the Hessian trace computation can be made tractable using Hutchinson's algorithm. Strengths: - Hessian-related metrics have been widely adopted to capture the differing sensitivity of layers. This paper compares several Hessian-related approaches and provides some mathematical analysis of why the Hessian trace can be considered a good metric for choosing quantization bit-widths.
Review for NeurIPS paper: HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
On the surface, this paper makes a small change to an existing approach, basing neural network weight quantization on the Hessian trace instead of the maximum eigenvalue. This is motivated by the observation that the Hessian should be positive semi-definite if the network has been optimized to a local optimum. Although superficially minor compared with using the largest eigenvalue (as proposed in [7]), this change makes it possible to apply Hutchinson's algorithm to compute an efficient sample-based approximation to the trace, which leads to large computational speedups and modest performance improvements. The reviewers were unanimous that the paper is at least marginally above the acceptance threshold.